1.  Introduction

Statistical  spectral  analysis  has  several  roles  in  time  series 
analysis:  (i)  estimation;  (ii)  hypothesis  testing  and  hypothesis 
suggesting;  and  (iii)  description  and  reduction  of  data. 

In  any  field  where  the  properties  of  the  phenomenon  being  studied 
can  be  characterized  in  terms  of  its  behavior  in  the  frequency  domain 
one  needs  to  estimate  spectral  density  functions  and  other  spectral 
characteristics  associated  with  stationary  multiple  time  series. 

However,  spectral  analytic  techniques  seem  to  provide  also  a 
means  of  testing  the  fit  of  various  models  (the  goodness-of-fit  of  a 
model  can  be  discussed  using  sample  spectra  of  the  residuals  from  the 
fitted  model)  and  suggesting  possible  models  to  fit  (explanatory 
"variables"  or  "mechanisms"  to  be  fitted  to  a  time  series  are  often 
suggested by sample spectra). As stated so lucidly by Herman Wold
(1947): "empirical time series present such a host of widely different
patterns that the hypotheses about their structure cannot adequately
be brought together into a single parameter system." Consequently,
an analysis of a time series is not accomplished by adopting a single
model, the parameters of which are estimated. Rather, it is best
carried out by a process of increasing insight from successive
analyses.

Prepared with the partial support of the Office of Naval Research.
Reproduction is permitted for any purpose of the United States Government.
To be presented at the International Statistical Institute,
September, 1965.

To a sample {X(t), t = 1, 2, ..., T} of a time series, one can
associate a function, called the sample spectral density function or
periodogram, defined by

\[
f_T(\omega) = \frac{1}{2\pi T} \left| \sum_{t=1}^{T} e^{-i\omega t} X(t) \right|^2 , \qquad -\pi \le \omega \le \pi .
\]
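As a minimal sketch of this definition (assuming NumPy; the function name `periodogram` and the synthetic series with a 12-month cosine are illustrative assumptions, not taken from the paper), the periodogram can be evaluated directly from the finite Fourier sum:

```python
import numpy as np

def periodogram(x, omegas):
    """f_T(omega) = |sum_{t=1}^T exp(-i*omega*t) X(t)|^2 / (2*pi*T)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    t = np.arange(1, T + 1)                      # time index t = 1, ..., T
    omegas = np.atleast_1d(np.asarray(omegas, dtype=float))
    # Row j holds exp(-i * omega_j * t); its product with x is the finite Fourier sum.
    fourier_sum = np.exp(-1j * np.outer(omegas, t)) @ x
    return np.abs(fourier_sum) ** 2 / (2.0 * np.pi * T)

# Synthetic monthly series (an assumption for illustration):
# one strict 12-month periodicity plus purely random fluctuations.
rng = np.random.default_rng(0)
t = np.arange(1, 145)
x = np.cos(2 * np.pi * t / 12) + 0.5 * rng.standard_normal(t.size)

grid = np.pi * np.arange(73) / 72                # 0, pi/72, ..., pi
f = periodogram(x, grid)
# Location of the largest ordinate, to be compared with the true frequency 2*pi/12.
print(grid[np.argmax(f)], 2 * np.pi / 12)
```

For a series that really is a strict periodicity plus noise, the largest ordinate should sit at or next to the frequency of the periodicity, which is the use Schuster had in mind.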

The periodogram was introduced by Schuster to estimate the frequencies
of strict periodicities in a time series satisfying the
following  assumptions:  (i)  it  is  not  evolving  but  is  oscillating 
about  a  constant  level;  (ii)  it  may  be  regarded  as  composed  of  a  number 
of  "strict"  periodicities  plus  purely  random  fluctuations. 

Since  the  model  of  strict  periodicities  plus  random  noise  seems  to 
occur  rarely  in  practice,  the  Schuster  periodogram  often  discovered 
spurious  cycles.  To  remedy  this,  the  notion  of  "disturbed"  periodicity 
was  introduced  by  means  of  autoregressive  models  and  moving  average 
models.  The  correlogram  came  to  the  fore,  and  periodogram  analysis 
fell  into  disfavor,  if  not  disrepute. 

Autoregressive  and  moving  average  models  are  (under  some  additional 
assumptions)  special  cases  of  stationary  time  series.  The  problems  of 
finding  the  order  of  a  finite  parameter  scheme  such  as  autoregressive 
and moving average models led time series analysts to adopt a
"non-parametric" approach and first estimate the spectral density function of
the observed time series under the assumption that it was a stationary
time series with no strict periodicities. Techniques of spectral analysis
(using smoothed periodograms) came back into favor, when interpreted as
estimates  of  the  spectral  density  function  of  an  underlying  stationary 
time  series. 

But  there  is  more  to  life  than  stationary  time  series  with 
continuous  spectra.  Consequently,  statisticians  added  the  possibility 
of  strict  periodicities  back  to  the  model.  Thus  was  born  the  so-called 
problem  of  mixed  spectra  (see  Hext  (1966)  for  a  current  survey). 

Finally,  the  assumption  that  the  observed  time  series  is  trend 
free  is  unnatural.  If  one  adds  trend  to  a  stationary  time  series,  one 
has a time series which can be regarded as derived from a stationary
time series by a filtering process. A theory of Fourier analysis can
be developed for such time series, and one might seek to estimate this
spectrum. In our approach to empirical time series analysis [see
Parzen (1966)], the emphasis is on the use of spectra defined from
samples rather than from populations or ensembles. Given an observed
time  series  of  finite  length,  or  a  time  series  derived  from  it,  one 
defines  various  "sample  spectral  functions"  such  as  windowed  sample 
spectral density functions and distribution functions. Their properties
can be determined for each possible model one desires to consider
for  the  observed  time  series.  Consequently,  they  can  be  used  to  form 
estimates  of  the  parameters  characterizing  the  model.  Further,  they 
can  be  used  to  determine  an  appropriate  model  by  comparing  the  actual 
appearance  of  these  spectral  functions  with  their  expected  appearance 
under  the  various  models;  that  model  for  which  the  correspondence  is 
closest  is  considered  the  most  likely. 

Our aim in this paper is: (1) to summarize the basic formulas
employed in the empirical spectral analysis of a single time series, and
(2) to show their applicability to the problem of analyzing and
synthesizing "adaptive predictors" for time series. The empirical
spectral analysis of multiple time series is discussed in Parzen (1965).

Other uses of spectral analysis are described in the excellent
survey paper of Jenkins (1965).

2.  Sample  convolution  function 

In order to define windowed sample spectral density functions (or
smoothed periodograms) it is convenient to first introduce the sample
convolution function, denoted $R_T(\cdot)$, of an observed sample
{X(t), t = 1, 2, ..., T}:

\[
R_T(v) =
\begin{cases}
\dfrac{1}{T} \sum_{t=1}^{T-v} X(t)\, X(t+v), & v = 0, 1, \ldots, T-1, \\
R_T(-v), & v = -1, \ldots, -(T-1), \\
0, & \text{otherwise}.
\end{cases}
\tag{1}
\]
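Definition (1) translates almost line for line into code. The following is a small sketch (assuming NumPy; the name `sample_convolution` is hypothetical) that treats the three cases of (1) explicitly:

```python
import numpy as np

def sample_convolution(x, v):
    """R_T(v) = (1/T) * sum_{t=1}^{T-|v|} X(t) X(t+|v|), and 0 for |v| >= T."""
    x = np.asarray(x, dtype=float)
    T = x.size
    v = abs(int(v))              # R_T is even: R_T(v) = R_T(-v)
    if v >= T:
        return 0.0               # "otherwise" branch of the definition
    # Products of observations v steps apart, divided by T (not by T - v).
    return float(np.dot(x[: T - v], x[v:]) / T)

x = [1.0, 2.0, 3.0]
print([sample_convolution(x, v) for v in range(-3, 4)])
```

Note that the divisor is T for every lag, so no mean is removed and no lag-dependent rescaling is applied; this matches the data-handling reading of the function given in the text.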

The  terminology  "sample  convolution  function"  is  not  standard  but 
is  introduced  in  this  paper  in  order  to  reserve  for  other  purposes  the 
terms  "sample  correlation  function"  or  "autocorrelation  function"  which 
are used by other authors. In the case that the time series of which
{X(t), t = 1, 2, ..., T} is a sample is known to have zero means and to
be covariance stationary with covariance function $R(\cdot)$, then $R_T(v)$
provides a possible estimate of $R(v)$. Because of this, in previous
writings [see Parzen (1964 a), (1964 b)] the author has called the
function $R_T(\cdot)$ the sample covariance function. However, it is our
belief that the computation of $R_T(\cdot)$ is of great value even for time
series which are not necessarily covariance stationary. It seems best
therefore when introducing this function for the first time to give it
a name which indicates its data-handling, rather than statistical,
character; such a name is "sample convolution function." Similarly,
the sample correlation function $\rho_T(v)$, defined below, is, from a
statistical point of view, an estimate of the true correlation function $\rho(v)$
of a covariance stationary time series with zero means, while from a
data-handling point of view, it is just the convolution function multiplied
by a scale factor so as to have value 1 at v = 0.

The  relations  that  exist  between  the  sample  convolution  function 
and  sample  spectral  density  function  of  an  observed  time  series  are  the 
same  as  those  that  exist  between  the  covariance  function  and  spectral 
density  function  of  a  covariance  stationary  time  series.  In  particular, 
we  note  the  following  facts. 

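The sample convolution function $R_T(v)$ and the sample spectral
density function $f_T(\omega)$ are both even functions of their arguments and
are a Fourier transform pair:

\[
\begin{aligned}
R_T(v) &= \int_{-\pi}^{\pi} \cos v\omega \, f_T(\omega)\, d\omega = 2 \int_{0}^{\pi} \cos v\omega \, f_T(\omega)\, d\omega , \\
f_T(\omega) &= \frac{1}{2\pi} R_T(0) + \frac{1}{\pi} \sum_{v=1}^{T-1} \cos v\omega \, R_T(v) .
\end{aligned}
\tag{2}
\]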
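The second relation in (2) can be checked numerically. The sketch below (assuming NumPy; the helper names and the random test series are illustrative assumptions) computes the sample spectral density once from the cosine series in the sample convolution function and once from the squared modulus of the finite Fourier sum, and compares the two:

```python
import numpy as np

def conv(x, v):
    """Sample convolution function R_T(v) for 0 <= v < T."""
    T = x.size
    return float(np.dot(x[: T - v], x[v:]) / T)

def density_from_convolution(x, omega):
    """(1/2pi) R_T(0) + (1/pi) * sum_{v=1}^{T-1} cos(v*omega) R_T(v)."""
    total = conv(x, 0) / (2.0 * np.pi)
    for v in range(1, x.size):
        total += np.cos(v * omega) * conv(x, v) / np.pi
    return total

def density_direct(x, omega):
    """|sum_t exp(-i*omega*t) X(t)|^2 / (2*pi*T), with t = 1, ..., T."""
    t = np.arange(1, x.size + 1)
    return abs(np.sum(np.exp(-1j * omega * t) * x)) ** 2 / (2.0 * np.pi * x.size)

rng = np.random.default_rng(0)
x = rng.standard_normal(20)                      # arbitrary test data
omegas = [0.0, 0.3, 1.0, np.pi]
via_convolution = np.array([density_from_convolution(x, w) for w in omegas])
via_fourier_sum = np.array([density_direct(x, w) for w in omegas])
print(np.max(np.abs(via_convolution - via_fourier_sum)))
```

The agreement holds for any data, stationary or not, since it is an algebraic identity between the two sample quantities.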


The sample distribution function. Given an observed time series
{X(t), t = 1, 2, ..., T}, the sample distribution function $F_T(\omega)$ is
a function of $\omega$ in the interval $0 \le \omega \le \pi$ defined by

\[
F_T(\omega) = 2 \int_{0}^{\omega} f_T(\lambda)\, d\lambda
= \frac{\omega}{\pi} R_T(0) + \frac{2}{\pi} \sum_{v=1}^{T-1} \frac{\sin v\omega}{v} R_T(v) .
\tag{3}
\]


Conversely, $R_T(v)$ is the Fourier-Stieltjes transform of $F_T(\omega)$:

\[
R_T(v) = \int_{0}^{\pi} \cos v\omega \, dF_T(\omega) .
\tag{4}
\]
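A short sketch of the sample distribution function, computed from the sine series in the sample convolution function (assuming NumPy; the name `sample_distribution` and the random test series are assumptions made for illustration). Two properties that follow from the definition are easy to inspect: the function starts at 0, and it reaches the value of the sample convolution function at lag 0 when the frequency equals pi.

```python
import numpy as np

def sample_distribution(x, omegas):
    """(omega/pi) R_T(0) + (2/pi) * sum_{v=1}^{T-1} sin(v*omega)/v * R_T(v)."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # Sample convolution function at lags 0, ..., T-1.
    R = np.array([np.dot(x[: T - v], x[v:]) / T for v in range(T)])
    omegas = np.atleast_1d(np.asarray(omegas, dtype=float))
    v = np.arange(1, T)
    sines = np.sin(np.outer(omegas, v)) / v      # sin(v*omega)/v, one row per frequency
    return omegas / np.pi * R[0] + 2.0 / np.pi * (sines @ R[1:])

rng = np.random.default_rng(0)
x = rng.standard_normal(50)                      # arbitrary test data
grid = np.linspace(0.0, np.pi, 200)
F = sample_distribution(x, grid)
R0 = float(np.dot(x, x) / x.size)
print(F[0], F[-1], R0)
```

Because the sample spectral density is non-negative, the computed values should never decrease along the grid, apart from rounding error.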


The spectral distribution function is a monotone increasing function
of $\omega$. Consequently it fluctuates much less than the sample spectral
density function $f_T(\omega)$. This is both a virtue and a vice. Certain real
effects which it is the aim of the investigation to discern will show up
most clearly in the spectral density function whereas they may be overlooked
in the spectral distribution function. On the other hand,
certain specious effects may appear to show up in the spectral density
function which on the basis of the spectral distribution function may be
rejected as pure fluctuation.

Sample correlation function. The sample correlation function,
denoted $\rho_T(\cdot)$, is defined by

\[
\rho_T(v) = \frac{R_T(v)}{R_T(0)} .
\tag{5}
\]

In words, $\rho_T(v)$ is the sample convolution function normalized to have
value 1 at v = 0.

Normalized sample spectral density function. For ease of comparing
the sample spectral density functions arising from different time series,
it seems best to compute and plot normalized versions of these functions.
Since

\[
R_T(0) = \int_{-\pi}^{\pi} f_T(\omega)\, d\omega ,
\tag{6}
\]

the natural normalization of $f_T(\omega)$ is

\[
\bar{f}_T(\omega) = \frac{f_T(\omega)}{R_T(0)} ,
\tag{7}
\]

which has the property that its integral from $-\pi$ to $\pi$ equals 1.

We call $\bar{f}_T(\omega)$ the normalized spectral density function; note that it
is also the spectral density function of the sample correlation function,

\[
\rho_T(v) = \int_{-\pi}^{\pi} e^{iv\omega}\, \bar{f}_T(\omega)\, d\omega .
\tag{8}
\]


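3.  Windowed sample spectral density and distribution functions

The windowed sample spectral density function, denoted $f_{T,M}(\omega)$,
is defined by (for $-\pi \le \omega \le \pi$)

\[
f_{T,M}(\omega) = \frac{1}{2\pi} \sum_{|v| < M} e^{-iv\omega}\, k\!\left(\frac{v}{M}\right) R_T(v) ,
\tag{1}
\]

where $R_T(\cdot)$ is the sample convolution function.

The windowed normalized sample spectral density function, denoted
$\bar{f}_{T,M}(\omega)$, is defined by (for $-\pi \le \omega \le \pi$)

\[
\bar{f}_{T,M}(\omega) = \frac{1}{2\pi} \sum_{|v| < M} \cos v\omega \; k\!\left(\frac{v}{M}\right) \rho_T(v) ,
\tag{2}
\]

where $\rho_T(\cdot)$ is the sample correlation function.

The windowed normalized sample spectral distribution function,
denoted $\bar{F}_{T,M}(\omega)$, is defined by (for $0 \le \omega \le \pi$)

\[
\bar{F}_{T,M}(\omega) = 2 \int_{0}^{\omega} \bar{f}_{T,M}(\lambda)\, d\lambda .
\tag{3}
\]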

Lag Windows. The function $k(\cdot)$ is known as the lag window of
the windowed spectrum. In our work we use mainly the following lag
window:

\[
k(u) =
\begin{cases}
1 - 6u^2 + 6|u|^3 , & |u| \le 0.5 , \\
2\,(1 - |u|)^3 , & 0.5 \le |u| \le 1.0 , \\
0 , & |u| > 1 .
\end{cases}
\tag{4}
\]


A kernel widely used in existing spectral analysis programs is one
suggested by Tukey (see Blackman and Tukey (1958), p. 14):

\[
k(u) =
\begin{cases}
\tfrac{1}{2}\,(1 + \cos \pi u) , & |u| \le 1 , \\
0 , & \text{otherwise} .
\end{cases}
\tag{5}
\]

This lag window is not used in our work because the corresponding
windowed spectrum is not necessarily non-negative (and the corresponding
estimates of coherence are not necessarily between 0 and 1).
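
The contrast between the two lag windows can be illustrated numerically. The sketch below (assuming NumPy; the function names and the choice M = 4 are illustrative assumptions) evaluates the cosine transform of the lag weights k(v/M) for both windows. A windowed spectrum is an average of the non-negative periodogram with these transforms as weights, so a transform that dips below zero is what allows a negative windowed spectrum.

```python
import numpy as np

def parzen_lag_window(u):
    """Lag window (4): piecewise cubic, zero outside |u| <= 1."""
    a = np.abs(np.asarray(u, dtype=float))
    inner = 1.0 - 6.0 * a**2 + 6.0 * a**3
    outer = 2.0 * (1.0 - a) ** 3
    return np.where(a <= 0.5, inner, np.where(a <= 1.0, outer, 0.0))

def tukey_lag_window(u):
    """Lag window (5): raised cosine, zero outside |u| <= 1."""
    a = np.abs(np.asarray(u, dtype=float))
    return np.where(a <= 1.0, 0.5 * (1.0 + np.cos(np.pi * a)), 0.0)

def window_transform(lag_window, M, omegas):
    """(1/2pi) * sum_{|v|<M} cos(v*omega) k(v/M), for each omega."""
    v = np.arange(-M + 1, M)
    weights = lag_window(v / M)
    return np.cos(np.outer(omegas, v)) @ weights / (2.0 * np.pi)

omegas = np.linspace(0.0, np.pi, 181)
parzen_min = window_transform(parzen_lag_window, 4, omegas).min()
tukey_min = window_transform(tukey_lag_window, 4, omegas).min()
print(parzen_min, tukey_min)
```

With these settings the smallest value for window (4) should be zero up to rounding, while the smallest value for window (5) should be strictly negative.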

Truncation Points. The integer M (< T) is called the truncation
point of the windowed spectrum since it represents the number of sample
correlations, of the T available, actually used in computing the
spectrum. It is wise to choose several truncation points in practice.
In our computations, we usually choose three truncation points
$M_1$, $M_2$, $M_3$ as percentages of T:

\[
5\% \le \frac{M_1}{T} \le 10\% , \qquad 10\% \le \frac{M_2}{T} \le 25\% , \qquad 25\% \le \frac{M_3}{T} \le 75\% .
\]

An alternative rule is:

\[
5\%\, T \le M_1 \le 10\%\, T , \qquad 2M_1 \le M_2 \le 3M_1 , \qquad 2M_2 \le M_3 \le 3M_2 .
\]

Spectral Computation Number. There is a third choice to be made in
forming the estimate $f_{T,M}(\omega)$, and this is the number of points on the
interval 0 to $\pi$ at which it will be computed. We adopt the
attitude that $f_{T,M}(\omega)$ should be computed for equispaced frequencies

\[
\omega = \frac{\pi j}{Q} , \qquad j = 0, 1, \ldots, Q ,
\]

where Q is an integer to be chosen. We call Q the spectral computation
number.
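
Putting the pieces of this section together, the following sketch (assuming NumPy; the function names, the synthetic series, and the settings M = 36 and Q = 72 are assumptions chosen for illustration) computes a windowed normalized sample spectral density with the piecewise-cubic lag window at Q + 1 equispaced frequencies between 0 and pi:

```python
import numpy as np

def parzen_lag_window(u):
    a = np.abs(np.asarray(u, dtype=float))
    return np.where(a <= 0.5, 1.0 - 6.0 * a**2 + 6.0 * a**3,
                    np.where(a <= 1.0, 2.0 * (1.0 - a) ** 3, 0.0))

def windowed_normalized_spectrum(x, M, Q, lag_window=parzen_lag_window):
    """Return (omegas, values) with omegas = pi*j/Q, j = 0, ..., Q. Requires M <= T."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # Sample convolution function at lags 0, ..., M-1, then sample correlations.
    R = np.array([np.dot(x[: T - v], x[v:]) / T for v in range(M)])
    rho = R / R[0]
    v = np.arange(1, M)
    weights = lag_window(v / M) * rho[1:]
    omegas = np.pi * np.arange(Q + 1) / Q
    # The lag-0 term contributes 1; lags +v and -v contribute equally.
    values = (1.0 + 2.0 * (np.cos(np.outer(omegas, v)) @ weights)) / (2.0 * np.pi)
    return omegas, values

rng = np.random.default_rng(0)
t = np.arange(1, 145)
x = np.cos(2 * np.pi * t / 12) + 0.5 * rng.standard_normal(t.size)

omegas, values = windowed_normalized_spectrum(x, M=36, Q=72)
# Trapezoidal rule on [0, pi], doubled to cover [-pi, pi] by symmetry.
step = np.pi / 72
integral = 2.0 * step * (values[0] / 2.0 + values[1:-1].sum() + values[-1] / 2.0)
print(omegas[np.argmax(values)], integral)
```

Choosing Q = 72, a multiple of 12, places the frequency of the 12-month cycle exactly on the computation grid; with monthly data this is one reason to prefer such a Q.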

In  the  past  Q  has  frequently  been  chosen  to  be  equal  to  the 
truncation  point  M.  One  can  prove  a  sampling  theorem  to  the  effect 
that  the  estimated  spectrum  (which  is  a  function  of  w,  measured  in 
cycles  per  unit  of  observation  time,  in  the  interval  0  <  w  <  0.5) 
can  be  recovered  from  its  value  at  M  equally  spaced  points.  However, 
this  recovery  cannot  necessarily  be  done  by  linear  interpolation.  If 
the graph of the estimated spectrum is to be obtained by merely drawing
line segments connecting the computed values, one needs to compute the
spectrum  at  Q  equi-spaced  frequencies,  where  Q  should  be  at  least 
2M  and  perhaps  should  be  4M  (note:  further  research  is  needed  on 
this  point) . 

If  one  uses  3  truncation  points  ,  it  has  seemed 

reasonable  to  me  to  compute  each  spectrum  at  Q  =  points.  However, 
one  should  choose  Q  (approximately  equal  to  M^)  such  that  the 
frequencies which are multiples of $\pi/Q$ are of physical interest. For
economic time series of monthly data we usually choose Q to be a
multiple of 12.

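Spectral Window. The spectral window of the windowed spectrum
defined by (2) is defined to be the function

\[
K_M(\omega) = \frac{1}{2\pi} \sum_{|v| < M} e^{-iv\omega}\, k\!\left(\frac{v}{M}\right) .
\tag{6}
\]

For the lag window (4), it may be shown that

\[
K_M(\omega) = \frac{3}{8\pi M^3} \left( \frac{\sin (M\omega/4)}{\tfrac{1}{2} \sin (\omega/2)} \right)^{4} \left( 1 - \frac{2}{3} \sin^2 \frac{\omega}{2} \right) .
\tag{7}
\]

This is an even function which integrates to 1, has maximum value

\[
K_M(0) = \frac{3}{8\pi}\, M ,
\tag{8}
\]

and achieves its first zero at $\omega = 4\pi/M$. It is thus concentrated
about $\omega = 0$ with a rectangular bandwidth $8\pi/3M$ in radians and
4/3M in cycles per unit time.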
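
The stated properties of this spectral window can be checked numerically. The sketch below (assuming NumPy; the function names and the even truncation point M = 8 are assumptions) compares the defining sum over lags with the well-known closed form for the piecewise-cubic lag window, and reads off the peak height and the bandwidth. It takes "rectangular bandwidth" to mean the width of a rectangle with the same area (here 1) and the same peak height, an interpretation consistent with the value quoted in the text.

```python
import numpy as np

def parzen_lag_window(u):
    a = np.abs(np.asarray(u, dtype=float))
    return np.where(a <= 0.5, 1.0 - 6.0 * a**2 + 6.0 * a**3,
                    np.where(a <= 1.0, 2.0 * (1.0 - a) ** 3, 0.0))

def spectral_window_sum(M, omegas):
    """(1/2pi) * sum_{|v|<M} cos(v*omega) k(v/M)."""
    v = np.arange(-M + 1, M)
    return np.cos(np.outer(omegas, v)) @ parzen_lag_window(v / M) / (2.0 * np.pi)

def spectral_window_closed(M, omegas):
    """Closed form, used here for even M and omega != 0."""
    omegas = np.asarray(omegas, dtype=float)
    ratio = np.sin(M * omegas / 4.0) / (0.5 * np.sin(omegas / 2.0))
    taper = 1.0 - (2.0 / 3.0) * np.sin(omegas / 2.0) ** 2
    return 3.0 / (8.0 * np.pi * M**3) * ratio**4 * taper

M = 8
omegas = np.linspace(0.01, np.pi, 200)           # avoid omega = 0 in the closed form
max_gap = np.max(np.abs(spectral_window_sum(M, omegas) - spectral_window_closed(M, omegas)))
peak = spectral_window_sum(M, np.array([0.0]))[0]
bandwidth = 1.0 / peak                           # area 1 divided by peak height
print(max_gap, peak, bandwidth)
```

The sketch uses an even M because the closed form agrees with the sum exactly in that case; for odd M the two differ slightly, which can be seen already at M = 1.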

In order to understand the name "spectral window," we must first
note the relation that exists between the sample spectral density
function $f_T(\omega)$ and the windowed spectrum $f_{T,M}(\omega)$:

\[
f_{T,M}(\omega) = \int_{-\pi}^{\pi} K_M(\omega - \lambda)\, f_T(\lambda)\, d\lambda ,
\tag{9}
\]

since

\[
f_{T,M}(\omega) = \frac{1}{2\pi} \sum_{|v| < M} e^{-iv\omega}\, k\!\left(\frac{v}{M}\right) \int_{-\pi}^{\pi} e^{iv\lambda}\, f_T(\lambda)\, d\lambda .
\tag{10}
\]

Thus $f_{T,M}(\omega)$ is the convolution of $f_T(\omega)$ and $K_M(\omega)$. In
other words, $f_{T,M}(\omega)$ is an averaging over the values of $f_T(\omega)$
when it is viewed through a window (or channel) of variable transmission
properties given by $K_M(\omega)$.

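A useful approximation to $K_M(\omega)$ can be obtained by introducing
the Fourier transform

\[
K(\omega) = \frac{1}{2\pi} \int_{-\infty}^{\infty} e^{-iu\omega}\, k(u)\, du ,
\tag{11}
\]

which we call the spectral window generator. It may be shown that
approximately

\[
K_M(\omega) \approx M\, K(M\omega) ,
\tag{12}
\]

since exactly

\[
K_M(\omega) = M \sum_{j=-\infty}^{\infty} K\big( M(\omega - 2\pi j) \big) .
\tag{13}
\]

From this basic formula we obtain the formula (7) for $K_M(\omega)$. For
the lag window (4), the spectral window generator may be shown to be

\[
K(\omega) = \frac{3}{8\pi} \left( \frac{\sin (\omega/4)}{\omega/4} \right)^{4} .
\tag{14}
\]

Then

\[
M\, K(M\omega) = \frac{3M}{8\pi} \left( \frac{\sin (M\omega/4)}{M\omega/4} \right)^{4}
\tag{15}
\]

approximates $K_M(\omega)$ given by (7).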

Rather than giving a theoretical discussion of the properties of
windowed sample spectra we illustrate their use by analyzing a time
series which has been extensively discussed from the point of view of
forecasting. This is a monthly series of international airline passenger
bookings, 1949-1960; compare Brown (1963), p. 429 and Barnard
(1963).

It  is  to  be  emphasized  that  there  is  not  a  uniquely  best  way  in 
which  spectral  analytic  ideas  enter  into  time  series  analysis  once  one 
drops  the  assumption  that  one  is  dealing  with  a  stationary  time  series. 
The  attitudes  to  data  analysis  presented  in  this  paper  should  be  used 
in  conjunction  with  other  attitudes  such  as  computing  time  varying 
spectra. 

4.  Analysis of an empirical time series

The monthly time series of international airline passengers has
the characteristic features of many social and economic time series; in
particular, there is an upward trend and a seasonal variation. In
figure 1 we give plots of three windowed sample spectral density functions
(computed for truncation points 72, 36, and 18 on a series of
length 144). The existence of trend is evidenced by peaks at frequency
0, while the existence of seasonal variation is evidenced by peaks at
the seasonal frequencies .083, .167, .25, .33, and .42 cycles per
month.

In order to gain insight into the structure of a time series, we
often seek to find the coefficients of the minimum mean square error
linear predictor $\hat{X}(t)$ of the time series X(t) of the form

\[
\hat{X}(t) = a_1 X(t-1) + \cdots + a_m X(t-m) .
\tag{1}
\]

There are three ways in which one can fit an autoregressive scheme to
data: (i) one can specify the order m and estimate the coefficients
$a_i$ by solving the system of linear equations

\[
E[\hat{X}(t)\, X(t-i)] = E[X(t)\, X(t-i)] , \qquad i = 1, 2, \ldots, m ;
\tag{2}
\]

(ii) one can take the possible order m to be some large number (such
as 50 months) but admit only those lags whose coefficients $a_i$ are
"significantly" different from zero; (iii) one can take the possible
order m to be some large number but solve for the coefficients $a_i$
in order of decreasing contribution to the residual sum of squares,
and use only a specified number of coefficients. Applying these procedures
to the airline passenger series we find the following results:

(i) if one fits a 13th order scheme,

\[
\begin{aligned}
\hat{X}_1(t) = {} & 0.985\, X(t-1) + 0.248\, X(t-11) - 0.346\, X(t-13) - 0.234\, X(t-6) \\
& + 0.190\, X(t-12) + 0.068\, X(t-2) - 0.080\, X(t-3) + 0.050\, X(t-4) \\
& - 0.007\, X(t-8) + 0.019\, X(t-5) - 0.008\, X(t-7) + 0.012\, X(t-10) \\
& - 0.011\, X(t-9) .
\end{aligned}
\tag{3}
\]

The coefficients are written in order of decreasing contribution to
mean square prediction error. Next let us seek only the coefficients
making a "significant" contribution to the residual sum of squares; one
would then fit a first order predictor

\[
\hat{X}_2(t) = 0.986\, X(t-1) .
\tag{4}
\]

Finally let us arbitrarily choose to fit the best fitting 3 terms; one
obtains the predictor

\[
\hat{X}_3(t) = 0.957\, X(t-1) + 0.257\, X(t-11) - 0.224\, X(t-13) .
\tag{5}
\]


The means and variances of the original series and the residual
series are as follows:

Series                                    Mean     Variance
X(t)                                      280      1.43 x 10^4
$e_1(t) = X(t) - \hat{X}_1(t)$            5.36     5.13 x 10^2
$e_2(t) = X(t) - \hat{X}_2(t)$            5.39     1.11 x 10^3
$e_3(t) = X(t) - \hat{X}_3(t)$            4.81     6.28 x 10^2
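
Procedure (i) and the residual summaries of the kind tabulated above can be sketched as follows. This is only an illustration under stated assumptions: the expectations in the linear equations are replaced by the sample convolution function (the text does not say which estimate was used), the function names are hypothetical, and the data are a simulated first-order scheme rather than the airline series.

```python
import numpy as np

def fit_autoregression(x, m):
    """Solve sum_j a_j R_T(i-j) = R_T(i), i = 1, ..., m, for a_1, ..., a_m."""
    x = np.asarray(x, dtype=float)
    T = x.size
    # Sample convolution function at lags 0, ..., m (assumed stand-in for E[.]).
    R = np.array([np.dot(x[: T - v], x[v:]) / T for v in range(m + 1)])
    lags = np.arange(1, m + 1)
    G = R[np.abs(lags[:, None] - lags[None, :])]     # matrix with entries R_T(|i-j|)
    return np.linalg.solve(G, R[1:])

def residuals(x, a):
    """e(t) = X(t) - sum_k a_k X(t-k), for every t with a full set of past values."""
    x = np.asarray(x, dtype=float)
    m = len(a)
    predicted = sum(a[k - 1] * x[m - k : x.size - k] for k in range(1, m + 1))
    return x[m:] - predicted

# Simulated series (an assumption): X(t) = 0.8 X(t-1) + unit-variance noise.
rng = np.random.default_rng(1)
T = 5000
x = np.zeros(T)
for t in range(1, T):
    x[t] = 0.8 * x[t - 1] + rng.standard_normal()

a = fit_autoregression(x, 2)
e = residuals(x, a)
print(a, e.mean(), e.var())
```

For the simulated scheme the first fitted coefficient should be close to 0.8, the second close to zero, and the residual variance close to the noise variance; the spectrum of the residual series is then the natural next thing to examine.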

Next one examines the spectra of the residual series $e(t) = X(t) - \hat{X}(t)$.
The windowed sample spectral density function and spectral distribution
function of $e_3(t)$ are plotted in the top half of figures 2 and 3,
respectively; the trend in the original X(t) series has been eliminated,
but  the  seasonal  peaks  remain.  The  speetra  of  «x<t)  and  ^(t)  ere 
similar  except  that  e2(t)  has  stronger  seasonal  peaks. 

We next repeat the autoregressive model fitting procedures on the
residuals $e_2(t)$ and $e_3(t)$. We find (applying procedure 2 to $e_2(t)$)

\[
\hat{e}_2(t) = 0.851\, e_2(t-12) ,
\tag{6}
\]

while (applying procedure 3 to $e_3(t)$)

\[
\hat{e}_3(t) = 0.657\, e_3(t-12) + 0.141\, e_3(t-24) - 0.080\, e_3(t-20) .
\tag{7}
\]

In order to interpret the properties of the residuals
$\eta(t) = e(t) - \hat{e}(t)$ let us compare them with the forecasting errors
given by Barnard (1963) in his comparison of the "adaptive forecasting"
and Box-Jenkins methods. Data for 1949 and 1950 were used to provide
initial values, so that the forecast errors were given only for the 10
years, 1951-1960. For comparison we computed the spectra of the
forecasting errors $\eta_2(t)$ and $\eta_3(t)$ over this 10-year period.


Series of forecasting errors, 1951-1960      Mean     Variance
Adaptive forecasting method                   .28     1.78 x 10^2
Box-Jenkins method                           -.36     1.83 x 10^2
$\eta_2(t) = e_2(t) - \hat{e}_2(t)$          1.52     2.05 x 10^2
$\eta_3(t) = e_3(t) - \hat{e}_3(t)$          1.76     1.75 x 10^2

The spectra of the forecasting errors arising from adaptive forecasting
and the Box-Jenkins method are very different! They both are
far from the spectrum of white noise, but the adaptive forecasting errors
are predominantly low frequency while the Box-Jenkins forecasting errors
are predominantly high frequency; their sample windowed spectral density
functions and spectral distribution functions are plotted in figures 4
and 5, respectively, for a Parzen window and in figures 6 and 7 for a
Tukey window.

In the bottom half of figures 2 and 3 we plot the sample spectrum
of $\eta_3(t)$; it is essentially the spectrum of white noise.

The foregoing considerations lead to both a model and a forecasting
formula for X(t). Let $U_r$ be the r-th backward shift operator,
$U_r X(t) = X(t-r)$, and let I be the identity operator. Define

\[
\begin{aligned}
P_1 &= 0.957\, U_1 + 0.257\, U_{11} - 0.224\, U_{13} , \\
P_2 &= 0.657\, U_{12} - 0.080\, U_{20} + 0.141\, U_{24} .
\end{aligned}
\tag{8}
\]

Then there is a white-noise series $\eta(t)$ such that

\[
(I - P_2)(I - P_1)\, X(t) = \eta(t) .
\tag{9}
\]

Therefore a predictor of X(t) is given by

\[
\hat{X}(t) = (P_1 + P_2 - P_2 P_1)\, X(t) .
\tag{10}
\]

More generally, let $\hat{X}(t+v)$ denote the predictor of X(t+v) given
values of the time series up to time t. Then

\[
\hat{X}(t+v) = (P_1 + P_2 - P_2 P_1)\, \hat{X}(t+v) ;
\tag{11}
\]

note that $\hat{X}(s) = X(s)$ if $s \le t$.
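
The operator algebra behind the combined predictor is easy to mechanize. In the sketch below each operator is stored as a dictionary from lag r to the coefficient of the r-th backward shift, so that composing two shifts adds their lags. The function names are hypothetical, and the two small operators used in the demonstration (0.9 on lag 1 and 0.5 on lag 12) are illustrative assumptions, not the fitted operators of the text.

```python
def compose(p, q):
    """Product of two shift-operator polynomials; shifts by r and s compose to r + s."""
    out = {}
    for r, a in p.items():
        for s, b in q.items():
            out[r + s] = out.get(r + s, 0.0) + a * b
    return out

def combined_predictor(p1, p2):
    """Coefficients of P1 + P2 - P2*P1, from expanding (I - P2)(I - P1) = I - (P1 + P2 - P2*P1)."""
    out = dict(p1)
    for r, b in p2.items():
        out[r] = out.get(r, 0.0) + b
    for r, c in compose(p2, p1).items():
        out[r] = out.get(r, 0.0) - c
    return out

def forecast(history, coeffs, steps):
    """Iterate the predictor forward; earlier forecasts stand in for unobserved values."""
    values = [float(h) for h in history]         # needs at least max(lag) past values
    for _ in range(steps):
        values.append(sum(c * values[-lag] for lag, c in coeffs.items()))
    return values[len(history):]

# Illustrative operators (assumptions): P1 = 0.9 U_1 and P2 = 0.5 U_12.
p = combined_predictor({1: 0.9}, {12: 0.5})
ahead = forecast([1.0] * 13, p, steps=2)
print(sorted(p.items()), ahead)
```

The recursion in `forecast` mirrors the multi-step rule: values up to time t are taken from the data, and any later value required by the operator is replaced by its own forecast.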

The aim of the foregoing discussion has been to show one important
use of empirical spectral analysis: given an operator (such as $I - P_1$),
the properties of the time series $(I - P_1) X(t)$ can be studied without
regard to the procedure by which one formed the operator.